Model Comparison for Breast Cancer Prognosis Based on Clinical Data

Authors

  • Sabri Boughorbel
  • Rashid Al-Ali
  • Naser Elkum
  • Mansour Ebrahimi
Abstract

We compared the performance of several prediction techniques for breast cancer prognosis, using the area under the ROC curve (AU-ROC) for different prognosis periods. The analyzed dataset contained 1,981 patients; from an initial 25 variables, the 11 most common clinical predictors were retained. We compared eight models from a wide spectrum of predictive models, namely Generalized Linear Model (GLM), GLM-Net, Partial Least Squares (PLS), Support Vector Machines (SVM), Random Forests (RF), Neural Networks, k-Nearest Neighbors (k-NN) and Boosted Trees. To compare these models, a paired t-test was applied to the differences in model performance obtained from data resampling. Random Forests, Boosted Trees, Partial Least Squares and GLM-Net have superior overall performance, although it is only slightly higher than that of the other models. The comparative analysis also allowed us to define a relative variable importance as the average of the variable importance across the different models. Two sets of variables are identified from this analysis. The first includes number of positive lymph nodes, tumor size, cancer grade and estrogen receptor, all of which have an important influence on model predictability. The second set includes variables related to histological parameters and treatment types. The short-term versus long-term contribution of the clinical variables is also analyzed across the compared models. Among the various cancer treatment plans, the combination of chemotherapy and radiotherapy has the largest impact on cancer prognosis.
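
The two comparison steps named in the abstract, a paired t-test on resampled performance differences and a relative variable importance averaged over models, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, and scaling each model's importance scores by its own maximum before averaging is an assumption, since the abstract does not say how scores from different models were made comparable.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(auc_a, auc_b):
    """Paired t statistic for two models scored on the same resamples.

    auc_a[i] and auc_b[i] must be AU-ROC values from the same resample i.
    Returns (t, degrees_of_freedom); the p-value lookup is left out.
    """
    if len(auc_a) != len(auc_b):
        raise ValueError("both models need one AU-ROC value per resample")
    diffs = [a - b for a, b in zip(auc_a, auc_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two resamples are required")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences are constant; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


def relative_importance(importance_by_model):
    """Average variable importance across models.

    importance_by_model maps model name -> {variable: raw importance}.
    Assumption: each model's scores are non-negative and are divided by
    that model's largest score so that models share a 0..1 scale.
    Variables missing from a model count as 0 for that model.
    """
    variables = sorted({v for s in importance_by_model.values() for v in s})
    averaged = {}
    for var in variables:
        scaled = [
            scores.get(var, 0.0) / max(scores.values())
            for scores in importance_by_model.values()
        ]
        averaged[var] = mean(scaled)
    return averaged


if __name__ == "__main__":
    # Made-up numbers for illustration only; they are not from the study.
    t, df = paired_t_statistic([0.80, 0.82, 0.78, 0.81],
                               [0.75, 0.78, 0.74, 0.76])
    print(f"t = {t:.2f} with {df} degrees of freedom")
    print(relative_importance({
        "model_a": {"lymph_nodes": 10.0, "tumor_size": 5.0},
        "model_b": {"lymph_nodes": 2.0, "tumor_size": 2.0},
    }))
```

Pairing matters here because every model is evaluated on the same resamples, so the test works on per-resample differences instead of treating the two sets of AU-ROC values as independent samples.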

Similar resources

A Probabilistic Bayesian Classifier Approach for Breast Cancer Diagnosis and Prognosis

Medical diagnosis is among the most influential components of treatment policy. Recently, significant advances have been made in medical diagnosis using data mining techniques. Data mining, or knowledge discovery, is the search of large databases to discover patterns and evaluate the probability of future occurrences. In this paper, Bayesian Classifier is used as a Non-linear dat...

A Fuzzy Rule-based Expert System for the Prognosis of the Risk of Development of the Breast Cancer

Soft computing techniques play an important role in decision making for applications with imprecise and uncertain knowledge. The application of soft computing disciplines is rapidly emerging for diagnosis and prognosis in medical applications. Among the various soft computing techniques, the fuzzy expert system takes advantage of fuzzy set theory to provide computing with uncertain words. In a fuzzy exp...

Clinical Value of Serum S100A8/A9 and CA15-3 in the Diagnosis of Breast Cancer

Background and Objective: S100A8/A9 is a heterodimer calcium-binding protein which is involved in tumor cell proliferation, adhesion and invasion, and is proposed as a biomarker for better diagnosis and prognosis in many cancers. The aim of this study was to evaluate the simultaneous serum-based level of S100A8/A9 and CA15-3 as well-illustrated cancer biomarkers, as well as the...

A Novel Model to Combine Clinical and Pathway-Based Transcriptomic Information for the Prognosis Prediction of Breast Cancer

Breast cancer is the most common malignancy in women worldwide. With the increasing awareness of heterogeneity in breast cancers, better prediction of breast cancer prognosis is much needed for more personalized treatment and disease management. Towards this goal, we have developed a novel computational model for breast cancer prognosis by combining the Pathway Deregulation Score (PDS) based pa...

The Investigate Factors on Screening of the Breast Cancer Based on PEN-3 Model in Iranian Northern Women

Introduction: Because women's behavior toward early diagnosis of breast cancer is affected by cultural and social factors, the purpose of this study is to investigate factors associated with screening according to the PEN-3 model. Materials and Methods: The present study was cross-sectional. The samples studied were women above 20 years and the sample size was 1416 peo...

Journal title:

Volume 11  Issue 

Pages  -

Publication date: 2016